Introduction

We are going to reproduce three different plots, for which we are given only the data set and the final image file. Each of them has two versions:

  • The first version is a plot made using only the layer functions of ggplot2, leaving everything else at its defaults. That is, it uses only ggplot(), geom_*(), stat_*() and annotate().


  • The second version builds on the first plot to create a more complex display with customized scale_*, coord_*, facet_* or theme_* functions. Helper functions such as labs() or guides() can also be used.

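As a rough illustration of the difference between the two versions, the sketch below starts from a layers-only plot and then adds customizations on top of it. The mpg variables used (displ, hwy, drv) and the object names p_simple and p_complex are illustrative assumptions, not part of the three target plots.

```r
library(ggplot2)

# Simple version: only ggplot() plus a geom layer, everything else at defaults
p_simple <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = drv))

# Complex version: same layers, plus scale, facet, theme and label tweaks
p_complex <- p_simple +
  scale_colour_brewer(palette = "Dark2") +
  facet_wrap(~drv) +
  theme_minimal() +
  labs(x = "Engine displacement (l)", y = "Highway miles per gallon",
       colour = "Drive train")

p_simple
p_complex
```

Note that the customizations add no new layers: both objects still contain the single geom_point() layer.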

library(ggplot2)

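Note: loading ggplot2 makes the diamonds and mpg data sets available. ToothGrowth ships with R's built-in datasets package, so all three data sets used in this practical are ready to use.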
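To see which attached package provides each data set, a quick check along these lines can be run in a fresh session. It relies on find() from the utils package, which searches the attached packages for an object name; the object name origin is just illustrative.

```r
library(ggplot2)

# For each data set name, list the attached package(s) that provide it
origin <- lapply(c(diamonds = "diamonds", mpg = "mpg",
                   ToothGrowth = "ToothGrowth"), find)
origin
```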

Organization of the practical

Different icons appear throughout the document. They mark:

  • additional or useful information
  • a worked example
  • a practical exercise
  • a space to answer the exercise
  • a hint to solve an exercise
  • a more challenging exercise


Simple versions

Plot 1

# Dataset used: diamonds
library(ggplot2)
data("diamonds")  # optional: diamonds is already available once ggplot2 is loaded

ggplot(diamonds, aes(clarity, log10(carat))) + 
  geom_jitter(aes(colour = cut)) +  # x and y are inherited from ggplot()
  geom_violin(fill = "black", color = "black") +
  labs(x = "Clarity", y = "log10(Carat)", color = "Cut")

Answer:

Plot 2

library(ggplot2)
ggplot(mpg, aes(manufacturer, fill = class))+
  geom_bar() +
  geom_text(stat = "count", aes(label = after_stat(count)), 
            position = position_stack(vjust = 0.5)) +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))

Need help with the count numbers shown in each stacked bar? https://stackoverflow.com/questions/6644997/showing-data-values-on-stacked-bar-chart-in-ggplot2
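An alternative to computing the counts inside the plot is to tabulate them first and draw the result with geom_col(). This is only a sketch: the helper object counts and its Freq column (the default name produced by as.data.frame() on a table) are not part of the original solution.

```r
library(ggplot2)

# Count cars per manufacturer/class combination
counts <- as.data.frame(table(manufacturer = mpg$manufacturer,
                              class = mpg$class))
# Drop empty combinations so no "0" labels are drawn
counts <- counts[counts$Freq > 0, ]

ggplot(counts, aes(manufacturer, Freq, fill = class)) +
  geom_col() +
  # vjust = 0.5 centres each label inside its bar segment
  geom_text(aes(label = Freq), position = position_stack(vjust = 0.5))
```

Pre-computing the table makes it easy to inspect the numbers before plotting, at the cost of an extra step compared with stat = "count".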

Answer:

Plot 3

# Dataset: ToothGrowth
ggplot(ToothGrowth, aes(factor(dose), len)) +
  geom_boxplot(aes(fill = supp)) +
  # Horizontal line at the overall mean tooth length
  geom_hline(yintercept = mean(ToothGrowth$len)) +
  # annotate() draws the label once; geom_text() with a constant aes() label
  # repeats it for all 60 rows and triggers a warning
  annotate("text", x = 1, y = 20, label = "Mean")

Answer:


Complex versions

Plot 1

ggplot(diamonds, aes(x = clarity, y = log10(carat), color = cut)) +
  geom_jitter(mapping = aes(alpha = 1)) + 
  geom_violin(scale = "width", fill = "black", colour = "black") +
  facet_wrap(~cut, nrow = 1) +  # Facet by cut
  labs(title = "Diamond quality", y = "log10(Carat)", x = "clarity")+
  theme(legend.position = "none", axis.text.x = element_text(angle = 45, hjust = 1, size = 6))

Answer:

Plot 2

ggplot(mpg, aes(manufacturer, fill = class)) +
  geom_bar() +
  geom_text(stat = "count", aes(label = after_stat(count)), 
            position = position_stack(vjust = 0.5), color = "black") + 
  coord_flip() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  labs(x = "Manufacturer", y = "Number of Cars", fill = "Car Type") +
  scale_fill_brewer(palette = "Set2")

Answer:

This graph uses a scale_fill_brewer() palette. Can you find which one?
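One way to narrow down the candidates is to list the ColorBrewer palettes that have enough colours for the car classes in mpg. The sketch assumes the RColorBrewer package is installed (it normally is, as a dependency of the scales package that ggplot2 uses); the object names n_classes and candidates are illustrative.

```r
library(RColorBrewer)

# Number of fill levels the palette has to cover
n_classes <- length(unique(ggplot2::mpg$class))

# Qualitative palettes suit unordered categories such as car class
candidates <- subset(brewer.pal.info,
                     category == "qual" & maxcolors >= n_classes)
rownames(candidates)

# Show the qualitative palettes side by side to compare with the target image
display.brewer.all(n = n_classes, type = "qual")
```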

Plot 3

ggplot(ToothGrowth, aes(factor(dose), len)) +
  geom_boxplot(aes(fill = supp)) +
  geom_hline(yintercept = mean(ToothGrowth$len)) +
  # annotate() draws the label once instead of once per data row
  annotate("text", x = 1, y = 20, label = "Mean") +
  labs(x = "Dose (mg/day)", y = "Tooth length", fill = "Supplement", caption = "Source: ToothGrowth dataset") +
  scale_fill_manual(
    values = c("OJ" = "#FFA500", "VC" = "#FFCC80"),  # Custom colors
    labels = c("OJ" = "Orange Juice", "VC" = "Vitamin C"))

Answer:

This graph uses a scale_fill_brewer() palette. Can you find which one?